Different Ways of Gaining Inferences
In the lecture component, we discussed the differences between Frequentist and Bayesian inference. As a brief reminder,
Frequentist approaches assume:
Bayesians assume:
For this in-class activity, we will be using the Araptus attenuatus beetle data we looked at way back in the discussion on Data Containers.
Figure 1: The enigmatic Sonoran Desert euphorb bark beetle, Araptus attenuatus (not to scale).
These data come from 33 sites (see map below) at which beetles were sampled from 10 host plants. For each site, the sex of every randomly sampled individual was recorded, along with the site coordinates and a measure of habitat suitability based upon ecological niche modeling (see Dr. McGarvey’s course on this topic; it is awesome!).
I would like you to use these data to make some inferences using both a Frequentist Approach (via a t-test) and a Bayesian approach (using the binomial). But first, let’s generate some understanding of the data.
We are going to use the sex ratio of this organism as a proxy for how animals choose where to live. In this species, male beetles colonize new habitat and try to attract females to them. If habitat quality does not affect where a female beetle decides to go, then we’d expect no relationship between sex ratio and habitat suitability (in our measure, larger numbers are better). Conversely, if habitat does have an effect and female beetles are actively choosing where to go, then it should be evident in this relationship.
Create a variable in the data that represents the reproductive sex ratio in this species at each locale. Using words, describe how you did it as if you were writing a sentence or two in a ‘Methods section’ of a paper.
Visualize this ratio using a plot to evaluate the relationship of sex ratio with habitat suitability.
In your own words, interpret this relationship. Is there a trend? Is it linear or non-linear?
A recent graduate student was interested in using sex ratio as a proxy for dispersal. If males are sessile once they have established at a site, then any changes in sex ratio should come from female preference and potentially differential dispersal ability. Their hypothesis was that in marginal habitat (e.g., where suitability is less than 0.25), observed sex ratios should be skewed away from 1:1.
Use the sample of sites with marginal habitat to test the null hypothesis that the average sex ratio is one, \(H_0: \mu = 1.0\), and interpret the results in text using the statistical output.
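A one-sample t-test of this kind can be sketched as follows. The vector `marginal_ratio` holds made-up values standing in for the sex ratios at marginal sites; both the name and the numbers are assumptions for illustration, not the beetle data.

```r
# Hypothetical sex ratios for sites with suitability < 0.25
# (illustrative values only, not the real beetle data).
marginal_ratio <- c(0.62, 0.81, 1.10, 0.74, 0.93, 0.68, 0.88)

# One-sample, two-sided t-test of H0: mu = 1.0.
fit <- t.test(marginal_ratio, mu = 1.0)

fit$statistic  # t value
fit$parameter  # degrees of freedom (n - 1)
fit$p.value    # probability of a t at least this extreme under H0
fit$conf.int   # 95% confidence interval for the mean ratio
```

The reported confidence interval lists the values of \(\mu\) that this sample would not reject at the 5% level.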
How deviant does the observed sex ratio have to be before we reject the Null Hypothesis given this output?
Here we want to know more about how variable the sex ratio can be and still appear to be even. We will adopt a Bayesian approach to this one, similar to the one we demonstrated in the catfish example.
On average, there are 48 samples (male and female beetles combined) per site, and these produce the sex ratio you defined above. Let’s change this to the probability of sampling a single sex of beetle (your choice) by creating a new column of data holding \(P(x|data) = N_x/(N_x + N_y)\), where \(N_x\) is the number of beetles of your chosen sex and \(N_y\) is the number of the other sex.
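As a minimal sketch of that calculation, assuming a data frame with one row per site and count columns named `Males` and `Females` (hypothetical names and values; use whatever your copy of the data provides):

```r
# Hypothetical per-site counts; column names and values are assumptions.
sites <- data.frame(
  Site    = c("A", "B", "C"),
  Males   = c(30, 22, 27),
  Females = c(18, 26, 21)
)

# Probability of drawing a male at each site: N_x / (N_x + N_y).
sites$p_male <- sites$Males / (sites$Males + sites$Females)

sites
```

Because the numerator is part of the denominator, every value falls between 0 and 1, unlike a male-to-female ratio, which has no upper bound.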
Define the probability of randomly selecting one of the sexes for each locale in the data set and then examine the distribution of these probabilities across all sites. What does it look like?
What is the mean of this distribution? And what is the standard deviation? Create these variables and we can use them as input into a prior probability model as follows.
If I found that the mean probability was \(\mu = 0.72\) and the standard deviation was \(\sigma = 0.08\), I could pull random numbers from a normal random distribution as:
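A single draw of this kind can be sketched with `rnorm()`. The seed is an assumption added only so the draw is repeatable; the value you get will not match the one printed in this handout.

```r
set.seed(42)  # assumed seed, not from the original activity

mu    <- 0.72  # example mean from the text
sigma <- 0.08  # example standard deviation from the text

# One random prior probability drawn from Normal(mu, sigma).
p_prior <- rnorm(1, mean = mu, sd = sigma)
p_prior
```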
[1] 0.6076744
Now, you define the values of mu and sd from the data.
Now we have prior data to feed into our model. Let’s define the underlying generative model. Here the likelihood of male and female is defined in the priors, and since there are only two mutually exclusive conditions, this is just what a Binomial Model does. Lucky for us!
So, in the previous example, we had a hypothetical prior of \(p_{prior} =\) 0.6077. Using the binomial model, we could find a random sample with this prior for some hypothetical sampling session of 50 beetles to find the number of one sex as:
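One way to sketch that draw is with `rbinom()`. The seed and the variable name `n_one_sex` are assumptions for illustration, so the count you get need not match the one printed in this handout.

```r
set.seed(42)  # assumed seed, not from the original activity

p_prior  <- 0.6077  # hypothetical prior from the text
n_sample <- 50      # beetles in the hypothetical sampling session

# Number of beetles of the chosen sex in one simulated session.
n_one_sex <- rbinom(1, size = n_sample, prob = p_prior)
n_one_sex
```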
[1] 35
This means that if the true probability of one sex in the data is \(p_{prior} =\) 0.6077, then a sample of 50 beetles from a real population could plausibly contain 35 of the identified sex.
So let’s look at the confidence we have now, given our observed data, on the value of the sex ratio. Let’s examine the sites we have that are in the marginal habitat.
What is the expected number of individuals in marginal habitats for the specific sex IF the sample size was 50?
Create a vector of random priors as we did above but this one has 10,000 random values instead of one.
Create another vector that has the expected number of observed individuals with each of these priors.
Put these together into a data.frame and figure out which prior probabilities can produce the number of individuals you identified as the targeted sex in the marginal habitat. Does this range include the value that corresponds to a sex ratio of 1.0 (a probability of 0.5)? What does this say about the sex ratios you saw in the marginal habitats in the real world?
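The three steps above can be sketched as one short simulation. Everything numeric here is a placeholder: the prior mean and standard deviation are the example values from earlier in the text, `observed` is a hypothetical count, and clamping the priors to [0, 1] is an added assumption, since a normal draw can occasionally fall outside the valid range for a probability.

```r
set.seed(42)  # assumed seed, only for repeatability

n_draws  <- 10000
n_sample <- 50
observed <- 35  # hypothetical count of the chosen sex; use your own

# Step 1: random priors, clamped to [0, 1] (clamping is an assumption).
priors <- rnorm(n_draws, mean = 0.72, sd = 0.08)
priors <- pmin(pmax(priors, 0), 1)

# Step 2: one simulated count per prior.
counts <- rbinom(n_draws, size = n_sample, prob = priors)

# Step 3: combine, then keep priors whose count matches the observation.
sim      <- data.frame(prior = priors, count = counts)
matching <- sim[sim$count == observed, ]

# Central 95% of the priors that reproduce the observed count.
quantile(matching$prior, probs = c(0.025, 0.975))
```

Keeping only the draws that reproduce the observed count is a simple rejection-style filter; the surviving priors approximate which probabilities are consistent with the observation.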